14.4 Stochastic Encoders and Decoders
Given a hidden code \(h\), we may think of the decoder as providing a conditional distribution \(p_{decoder}(x|h)\). We may then train the autoencoder by minimizing the negative log-likelihood \(-\log p_{decoder}(x|h)\).
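- If x is Gaussian (real-valued, with the decoder output as the mean and a fixed variance), the negative log-likelihood yields the mean squared error, up to scaling and an additive constant.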
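- If x is Bernoulli (binary), the decoder output passes through a sigmoid and the negative log-likelihood yields the binary cross-entropy. A softmax output corresponds instead to discrete (categorical) x.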
See Section 5.5.1 (p. 129) for a review.
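The link between the choice of \(p_{decoder}(x|h)\) and the reconstruction loss can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `gaussian_nll`, `softplus`, and `bernoulli_nll` are hypothetical, the Gaussian is assumed to have an isotropic variance `sigma**2`, and the Bernoulli decoder is assumed to output one logit per dimension that is passed through a sigmoid.

```python
import math

def gaussian_nll(x, mean, sigma=1.0):
    """-log N(x; mean, sigma^2 I), with the decoder output taken as the mean."""
    d = len(x)
    sse = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    # Squared-error term plus a constant that does not depend on the decoder output.
    return 0.5 * sse / sigma**2 + 0.5 * d * math.log(2 * math.pi * sigma**2)

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def bernoulli_nll(x, logits):
    """-sum_i log Bernoulli(x_i; sigmoid(logit_i)), i.e. binary cross-entropy."""
    # Identity used: -[x log s(z) + (1 - x) log(1 - s(z))] = softplus(z) - x * z
    return sum(softplus(z) - xi * z for xi, z in zip(x, logits))

# Gaussian case: the NLL equals half the sum of squared errors plus a constant.
x = [0.5, -1.0, 2.0]
mean = [0.0, -0.5, 1.5]
sse = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
const = 0.5 * len(x) * math.log(2 * math.pi)
print(gaussian_nll(x, mean) - (0.5 * sse + const))

# Bernoulli case: the NLL matches the cross-entropy computed from sigmoid outputs.
x_bin = [1.0, 0.0, 1.0]
logits = [2.0, -1.0, 0.5]
probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
bce = -sum(xi * math.log(p) + (1 - xi) * math.log(1 - p)
           for xi, p in zip(x_bin, probs))
print(bernoulli_nll(x_bin, logits) - bce)
```

Both printed differences should be zero up to floating-point rounding. Because the Gaussian constant does not depend on the decoder output, minimizing the negative log-likelihood and minimizing the squared error select the same decoder parameters under these assumptions.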